Recycling Terms into a Partial Parser
Abstract
Both full-text information retrieval and large-scale parsing require text preprocessing to identify strong lexical associations in textual databases. To combine linguistic felicity with computational efficiency, we have designed FASTR, a unification-based parser supporting large textual and grammatical databases. The grammar is composed of term rules obtained by tagging and lemmatizing term lists with an online dictionary. Through FASTR, large terminological data can be recycled for text processing purposes. Great stress is placed on the handling of term variations through metarules, which relate basic terms to their semantically close morphosyntactic variants. The quality of terminological extraction and the computational efficiency of FASTR are evaluated through a joint experiment with an industrial documentation center. The processing of two large technical corpora shows that the application is scalable to such industrial data and that accounting for term variants results in an increase of recall by 20%. Although automatic indexing is the most straightforward application of FASTR, it can be extended fruitfully to terminological acquisition and compound interpretation.
Similar resources
Evaluation Results of Concept-based Translator with Partial Parsing
In this paper, we describe an evaluation of the output of a translator that uses concept-based grammars. This translator translates the Korean sentence generated by a speech recognizer into an English sentence through a concept analysis approach. A partial parsing function was added to the translator and yielded an improvement because the performance of the parser (whole parser) is low in the sta...
Feature Engineering in Persian Dependency Parser
The dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free word order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of a phrase-structure parser fo...
The Effect of Task Repetition and Task Recycling on EFL Learners' Oral Performance
One of the major criticisms leveled at task-based language teaching (TBLT), despite its countless merits, is developing fluency at the cost of accuracy. The post-task stage affords a number of options to counteract this downside through task repetition and task recycling. These two options are considered to positively affect learners' oral performance in terms of fluency, accuracy, and complexi...
Partial Training for a Lexicalized-Grammar Parser
We propose a solution to the annotation bottleneck for statistical parsing, by exploiting the lexicalized nature of Combinatory Categorial Grammar (CCG). The parsing model uses predicate-argument dependencies for training, which are derived from sequences of CCG lexical categories rather than full derivations. A simple method is used for extracting dependencies from lexical category sequences, ...
Albany: A Component-Based Partial Differential Equation Code Built on Trilinos
Publication date: 1994